
Conversation

@Potabk (Collaborator) commented Jul 28, 2025

What this PR does / why we need it?

Currently our workflow takes about 3 hours to run in total, which seriously hurts the developer experience, so optimization is urgent. After this PR, the full CI run is expected to shorten to about 1h40min.

  • Enable linux-aarch64-a2 (64GB) to replace linux-arm64-npu (32GB)
  • Change TP4 to TP2 * 2 with max-parallel
  • Move DeepSeek-V2-Lite-W8A8 to single card test
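The TP4-to-TP2 change above can be sketched as a workflow fragment. This is a hypothetical illustration, not the actual vllm-ascend workflow: the job name, matrix values, and pytest invocation are assumptions; only the runner label and the max-parallel idea come from the PR.

```yaml
# Hypothetical sketch: split the single TP4 job into two TP2 groups that
# run concurrently on the larger 64GB aarch64 runners.
jobs:
  e2e-multicard:                  # job name is an assumption
    runs-on: linux-aarch64-a2     # 64GB runner, replacing linux-arm64-npu (32GB)
    strategy:
      max-parallel: 2             # both TP2 groups run at the same time
      matrix:
        test_group: [tp2-group-1, tp2-group-2]   # group names are assumptions
    steps:
      - uses: actions/checkout@v4
      - name: Run multicard e2e tests (TP2)
        run: pytest tests/e2e/multicard -m "${{ matrix.test_group }}"
```

With two concurrent TP2 jobs, wall-clock time approaches that of the slower group rather than a serial TP4 run, which is where part of the claimed 3h-to-1h40min saving would come from.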

Does this PR introduce any user-facing change?

No

How was this patch tested?

Potabk added 8 commits July 28, 2025 17:52
Signed-off-by: wangli <wangli858794774@gmail.com>
codecov bot commented Jul 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.83%. Comparing base (935e9d4) to head (4f04bfd).
⚠️ Report is 616 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2065      +/-   ##
==========================================
- Coverage   73.85%   73.83%   -0.03%     
==========================================
  Files         103       96       -7     
  Lines       11425    10865     -560     
==========================================
- Hits         8438     8022     -416     
+ Misses       2987     2843     -144     
Flag        Coverage      Δ
unittests   73.83% <ø>    -0.03% ⬇️


Signed-off-by: wangli <wangli858794774@gmail.com>
@Yikun Yikun changed the title [CI] Speed up CI [CI] Enable linux-aarch64-a2 (64GB) and change tp4 --> tp2 * 2 max-parallel to speed up CI Jul 29, 2025
@Yikun Yikun changed the title [CI] Enable linux-aarch64-a2 (64GB) and change tp4 --> tp2 * 2 max-parallel to speed up CI [CI] Enable linux-aarch64-a2 (64GB) and tp2 * 2 max-parallel to speed up CI Jul 29, 2025
@Yikun (Collaborator) commented Jul 29, 2025

cc @wangxiyuan @ganyi1996ppo @jianzs @ApsarasX @zzzzwwjj @yiz-liu @whx-sjtu @Angazenn @mengwei805

FYI, after this PR we will use A2 (64GB) in CI

@Yikun (Collaborator) commented Jul 29, 2025

https://github.com/vllm-project/vllm-ascend/blob/main/benchmarks/scripts/run_accuracy.py
https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/multicard/test_offline_inference_distributed.py

These seem to need changes as well.

Also, several test functions should be added to the YAML; this can be done in a new PR. The following should be included:

def test_models_distributed_pangu():
def test_models_distributed_topk() -> None:
def test_models_distributed_Qwen3_W8A8():
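One way the three functions above could be wired into the workflow YAML. This is a hypothetical fragment: the job name and step layout are assumptions; the test file path and function names come from the comment above.

```yaml
# Hypothetical sketch: run the three named tests explicitly in the suite.
jobs:
  e2e-singlecard:                 # job name is an assumption
    runs-on: linux-aarch64-a2
    steps:
      - uses: actions/checkout@v4
      - name: Run distributed model tests
        run: |
          pytest tests/e2e/multicard/test_offline_inference_distributed.py \
            -k "test_models_distributed_pangu or test_models_distributed_topk or test_models_distributed_Qwen3_W8A8"
```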

Signed-off-by: wangli <wangli858794774@gmail.com>
@Yikun Yikun added accuracy-test enable all accuracy test for PR ready-for-test start test by label for PR labels Jul 29, 2025
@wangxiyuan wangxiyuan merged commit f60bb47 into vllm-project:main Jul 29, 2025
32 checks passed
@Potabk Potabk deleted the ci_opt branch July 29, 2025 11:34
weijinqian0 pushed a commit to weijinqian0/vllm-ascend that referenced this pull request Jul 30, 2025
… up CI (vllm-project#2065)

### What this PR does / why we need it?
Currently our workflow run time takes about 3 hours in total, which
seriously affects the developer experience, so it is urgent to have a
optimization, after this pr, It is expected that the running time of the
full CI can be shortened to 1h40min.

- Enable linux-aarch64-a2 (64GB) to replace linux-arm64-npu (32GB)
- Change TP4 ---> TP2 * 2 max-parallel
- Move DeepSeek-V2-Lite-W8A8 to single card test

### Does this PR introduce _any_ user-facing change?
No


- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@a248025

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
weijinqian0 pushed a commit to weijinqian0/vllm-ascend that referenced this pull request Jul 30, 2025
… up CI (vllm-project#2065)

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
wangxiyuan pushed a commit that referenced this pull request Sep 8, 2025
### What this PR does / why we need it?
Switch Infra to linux-aarch64-a2 and python to 3.11

Soft backport: #2065
Soft backport: #2072

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed
search all: `linux-arm64-npu` and `3.10`

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
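The "search all" step in the commit message above can be done with a grep sweep. This is a sketch, assuming the repo is checked out in the current directory; the file-type filters are assumptions.

```shell
# Find lingering references to the old runner label and Python version.
# '|| true' keeps the step green when nothing is left to migrate.
grep -rnE 'linux-arm64-npu|3\.10' . \
  --include='*.yml' --include='*.yaml' --include='*.toml' || true
```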
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
… up CI (vllm-project#2065)
Signed-off-by: wangli <wangli858794774@gmail.com>
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
… up CI (vllm-project#2065)
Signed-off-by: wangli <wangli858794774@gmail.com>

Labels

accuracy-test enable all accuracy test for PR module:tests ready-for-test start test by label for PR


3 participants